Pokemon consolidation by SamBoasman · Pull Request #79 · UoA-CARES/gymnasium_envrionments

SamBoasman · 2025-03-06T01:31:29Z

Adds

Support for algorithms that don't require normalized actions
Reward overlay feature for video recordings
Additional action, reward and episode_reward data recording
Shell scripts for different algorithm executions
Support for discrete
Custom Pokemon docker file

SamBoasman · 2025-03-06T01:32:22Z

Dockerfile

+RUN pip3 install -r requirements.txt
+
+WORKDIR /root
+RUN git clone https://github.com/PKWadsy/cares_pokemon_configs.git cares_rl_configs


Can probably pull from google drive with some reconfiguration.

yea, we should be putting ROMs on github

SamBoasman · 2025-03-06T01:32:55Z

requirements.txt

 pydantic==1.10.13
 torch==2.3.1
-pyboy==2.2.1
+pyboy==2.2.2


Test if v2.5.1 is usable.

SamBoasman · 2025-03-06T01:33:46Z

scripts/environments/pyboy/pyboy_environment.py

        return self.env.reset()

    def step(self, action: int) -> tuple:
+        # debug-log logging.info("Logging109")


Remove with logging import.

SamBoasman · 2025-03-06T01:34:23Z

scripts/run.py

 """

 import logging
+import os


Possible removal?

beardyFace · 2025-03-06T02:54:42Z

Dockerfile

+WORKDIR /workspace/cares_reinforcement_learning
+RUN git checkout -t origin/action-info-logging


the base docker for everything should be off the release versions

beardyFace · 2025-03-06T02:55:10Z

Dockerfile

+RUN pip3 install -r requirements.txt
+
+WORKDIR /root
+RUN git clone https://github.com/PKWadsy/cares_pokemon_configs.git cares_rl_configs


yea, we should be putting ROMs on github

beardyFace · 2025-03-06T02:55:34Z

scripts/environments/gym_environment.py

+    @abc.abstractmethod
+    def action_as_string(self, action):
+        raise NotImplemented("Override this method")
+


why is this required?

why is it not implemented for all the other tasks?

Remove this at this level
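
One possible way to act on this comment, as a rough sketch only: leave the base environment untouched, define `action_as_string` only on the task that needs it, and have the caller fall back to `str(action)` when the method is absent. The helper name `describe_action` and both example classes are hypothetical and not part of the repository. As a side note on the diff above, `NotImplemented` is a comparison sentinel rather than an exception class, so calling it raises a `TypeError`; `NotImplementedError` is the exception normally used for this purpose.

```python
def describe_action(env, action) -> str:
    """Return a readable label for an action (hypothetical helper).

    Uses env.action_as_string when the environment defines it,
    otherwise falls back to the plain string form of the action.
    """
    formatter = getattr(env, "action_as_string", None)
    if callable(formatter):
        return formatter(action)
    return str(action)


class PlainEnv:
    """Hypothetical task with no custom action labels."""


class ButtonEnv:
    """Hypothetical task that maps discrete actions to button names."""

    BUTTONS = ("up", "down", "left", "right", "a", "b")

    def action_as_string(self, action) -> str:
        return self.BUTTONS[int(action)]
```

With this shape, other tasks do not have to implement anything, and only the overlay or logging code calls `describe_action`.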

beardyFace · 2025-03-06T02:56:57Z

scripts/train_loops/policy_loop.py

+            # Horrible hack so I don't have to change all the algorithms
+            select_action_from_policy = agent.select_action_from_policy
+
+            if "info" in inspect.signature(select_action_from_policy).parameters:
+                normalised_action = select_action_from_policy(
+                    state, noise_scale=noise_scale, info=step_data
+                )
+            else:
+                normalised_action = select_action_from_policy(
+                    state, noise_scale=noise_scale
+                )
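
If the signature check in the diff above were kept in some form, it could at least be isolated in one small helper instead of living inline in the loop. The following is a minimal sketch under that assumption; `accepts_keyword` and `select_action` are hypothetical names, and only `select_action_from_policy`, `noise_scale`, `info`, and `step_data` come from the diff.

```python
import inspect


def accepts_keyword(func, name: str) -> bool:
    """Return True if func can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    # A **kwargs parameter accepts any keyword.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return True
    param = params.get(name)
    return param is not None and param.kind in (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )


def select_action(agent, state, noise_scale, step_data):
    """Call the agent's policy, passing `info` only when it is supported."""
    select = agent.select_action_from_policy
    kwargs = {"noise_scale": noise_scale}
    if accepts_keyword(select, "info"):
        kwargs["info"] = step_data
    return select(state, **kwargs)
```

This keeps a single call site, but it is still runtime introspection; giving every algorithm the same signature would remove the need for it.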


beardyFace · 2025-03-06T02:59:45Z

shell-scripts/catch.sh

remove this entire folder

beardyFace · 2025-03-06T03:00:03Z

scripts/util/configurations.py

+    domain: Optional[str] = ""
+    display: Optional[int] = 0


remove the redundant Optional
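
For context, `Optional[str]` is shorthand for "`str` or `None`". Both fields above default to a concrete value (`""` and `0`), so the plain types describe them fully. The sketch below uses a standard-library dataclass as a stand-in, with the hypothetical name `ExampleConfig`; the repository's actual configuration class is not shown here, and the same annotations would be expected to carry over to it.

```python
from dataclasses import dataclass
from typing import Optional, get_args


@dataclass
class ExampleConfig:
    # Plain types: the defaults are never None, so Optional adds nothing.
    domain: str = ""
    display: int = 0


# Optional[str] expands to Union[str, None].
optional_members = get_args(Optional[str])
```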

beardyFace · 2025-03-06T03:00:11Z

scripts/train_loops/policy_loop.py

+            record.stop_video()
+            video_dir = os.path.join(record.directory, "videos")
+            data_dir = os.path.join(record.directory, "data")
+
+            run_csv = os.path.join(data_dir, f"episode_{episode_num}.csv")
+            pd.DataFrame(run_data_rows).to_csv(run_csv, index=False)
+
+            if episode_reward > highest_reward:
+
+                highest_reward = episode_reward
+
+                new_record_video = os.path.join(
+                    video_dir, f"new_record_episode_{episode_num+1}.mp4"
+                )
+                training_video = os.path.join(video_dir, "temp_train_video.mp4")
+
+                logging.info(
+                    f"New highest reward of {episode_reward}. Saving video and run data..."
+                )
+
+                try:
+                    os.rename(training_video, new_record_video)
+                except:
+                    logging.error("An error renaming the video occured :/")
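
Independently of whether this change belongs in the policy loop, the rename step in the diff uses a bare `except:`, which also catches things like `KeyboardInterrupt` and hides the actual cause. A narrower version might look like the sketch below. The function name `promote_record_video` is hypothetical; the file names are taken from the diff. It swaps `os.rename` for `os.replace`, which overwrites an existing destination on every platform, and that substitution is an assumption about the desired behaviour.

```python
import logging
import os


def promote_record_video(video_dir: str, episode_num: int) -> bool:
    """Rename the temporary training video to a per-episode record file.

    Returns True on success and False if the rename failed.
    """
    src = os.path.join(video_dir, "temp_train_video.mp4")
    dst = os.path.join(video_dir, f"new_record_episode_{episode_num + 1}.mp4")
    try:
        os.replace(src, dst)
    except OSError:
        # Catch only filesystem errors and log the traceback with both paths.
        logging.exception("Could not rename %s to %s", src, dst)
        return False
    return True
```

Note that the diff names the CSV with `episode_num` and the video with `episode_num+1`; the sketch keeps the video naming as written, without assuming which of the two is intended.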


beardyFace · 2025-03-06T03:00:22Z

scripts/train_loops/policy_loop.py

-        if (total_step_counter + 1) % number_steps_per_evaluation == 0:
-            logging.info("*************--Evaluation Loop--*************")
-            evaluate_policy_network(
-                env_eval,
-                agent,
-                train_config,
-                record=record,
-                total_steps=total_step_counter,
-                normalisation=normalisation,
-            )
-            logging.info("--------------------------------------------")


nope - this does not get removed
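
For reference, the condition in the removed block triggers evaluation whenever `total_step_counter + 1` is a multiple of `number_steps_per_evaluation`. A tiny sketch of that check in isolation, where the function name `should_evaluate` is hypothetical:

```python
def should_evaluate(total_step_counter: int, number_steps_per_evaluation: int) -> bool:
    """Mirror the trigger condition from the existing training loop."""
    return (total_step_counter + 1) % number_steps_per_evaluation == 0


# Zero-based step indices at which evaluation would run with an interval of 5.
evaluation_steps = [step for step in range(10) if should_evaluate(step, 5)]
```

With a zero-based counter and an interval of 5, this selects steps 4 and 9, that is, the fifth and tenth steps.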

beardyFace · 2025-03-06T03:00:44Z

scripts/train_loops/policy_loop.py

Delete every single change made to policy loop - not happening

PKWadsy and others added 30 commits July 20, 2024 15:32

Fixed deep copy bug temp

7c3346b

ignored added shell script

311d388

Added fight script

9b3009e

Made pyboy env use sample action from env

d463495

Merge branch 'main' into pokemon-p4p-base

ec014aa

Added the discrete policy loop

c5e2ad8

Added discrete policy loop for pokemon and fight script

a707796

Merge branch 'main' into pokemon-p4p-base

77b9ff9

Removed discrete policy loop

eea9c31

Fixed import error

efb3ab9

Fixed policy loop

68220e9

Added discrete config

855c924

Added log comments and discretisation

e327a9b

Merge branch 'main' into pokemon-p4p-base

15f753f

Updated shell scripts

1277a7c

Made changes to gym which allow image overlay

128d0b8

Added more data saving - especially on highest reward

042b3e1

HUGE FIX

d58e537

renamed brock to flexi

f8d13d5

Set frames to stack back to 3

d99c5ec

Updated dockerfile and requirements

d20ad84

Updated dockerfile

bf94e9f

fixed dockerfile

3a4fea3

Fixed dockerfile

c4b0026

Fixed dockerfile

1c8676c

Removed eval and added better video saving

92d8cef

merged with main

5083be8

Docker file merge with main

c6ae3c5

Merge branch 'p4p-pokemon-docker' into docker-consolidation

6cb4625

Merge remote-tracking branch 'origin/docker' into docker-consolidation

be3f0ab

SamBoasman and others added 3 commits March 6, 2025 13:55

Merge remote-tracking branch 'origin/pokemon-p4p-base' into pokemon-consolidation

a9d3abe

Merge branch 'main' into pokemon-consolidation

cec78e6

Auto-format code 🧹🌟🤖

25d5f32

beardyFace requested changes Mar 6, 2025

beardyFace marked this pull request as draft March 6, 2025 03:03

Update directory attribute acess for recording

83cd7c5

3 participants
